Trace driven cache simulation is central to computer design. A trace is a very long sequence of reference lines from main memory. At the t(exp th) instant, reference x sub t is hashed into a set of cache locations, the contents of which are then compared with x sub t. If at the t sup th instant x sub t is not present in the cache, then it is said to be a miss, and is loaded into the cache set, possibly forcing the replacement of some other memory line, and making x sub t present for the (t+1) sup st instant. The problem of parallel simulation of a subtrace of N references directed to a C line cache set is considered, with the aim of determining which references are misses and related statistics. A simulation method is presented for the Least Recently Used (LRU) policy, which regradless of the set size C runs in time O(log N) using N processors on the exclusive read, exclusive write (EREW) parallel model. A simpler LRU simulation algorithm is given that runs in O(C log N) time using N/log N processors. Timings are presented of the second algorithm's implementation on the MasPar MP-1, a machine with 16384 processors. A broad class of reference based line replacement policies are considered, which includes LRU as well as the Least Frequently Used and Random replacement policies. A simulation method is presented for any such policy that on any trace of length N directed to a C line set runs in the O(C log N) time with high probability using N processors on the EREW model. The algorithms are simple, have very little space overhead, and are well suited for SIMD implementation.
展开▼
机译:跟踪驱动的缓存模拟对于计算机设计至关重要。轨迹是来自主存储器的非常长的参考线序列。在第t(expth)时刻,将参考x sub t散列到一组缓存位置中,然后将其内容与x sub t比较。如果在第t个瞬间x sub t在高速缓存中不存在,则称它为未命中,并被加载到高速缓存集中,可能会强制替换某些其他存储行,并使x sub t存在持续(t + 1)个瞬间。考虑了定向到C行缓存集的N个参考的子轨迹的并行模拟问题,目的是确定哪些参考是未命中和相关的统计信息。提出了一种用于最近最少使用(LRU)策略的仿真方法,该方法在专用读取,专用写入(EREW)并行模型上使用N个处理器在时间O(log N)的情况下对设置大小C进行无级运行。给出了使用N / log N处理器以O(C log N)时间运行的更简单的LRU仿真算法。给出了在具有16384个处理器的MasPar MP-1上第二算法实现的时间。考虑了广泛的基于参考的线路替换策略,其中包括LRU以及最少使用和随机替换策略。针对任何这样的策略,提出了一种仿真方法,即使用EREW模型上的N个处理器,在指向C行集的长度为N的任何迹线上以O(C log N)时间运行的可能性很高。该算法简单,空间开销很小,非常适合SIMD实现。
展开▼